To Ann-Sofie
Abstract
To reduce latency and increase bandwidth to memory, modern microprocessors are designed with deep memory hierarchies that include several levels of caches. On such microprocessors, fetching data from off-chip memory takes about two orders of magnitude longer than fetching it from the level-one cache. Consequently, application performance is largely determined by how well an application utilizes the caches in the memory hierarchy, which is captured by its miss ratio curve, that is, its miss ratio as a function of cache size. However, efficiently obtaining an application's miss ratio curve and interpreting its performance implications is hard. The task becomes even more challenging on multi-core processors, where several applications or threads share caches and memory bandwidth. Analyzing performance in this setting requires powerful techniques that capture applications' cache utilization and provide intuitive performance metrics.

In this thesis we present three techniques for analyzing application performance: StatStack, StatCC and Cache Pirating. Our main focus is on memory-hierarchy-related performance metrics such as miss ratio, fetch ratio and bandwidth demand, but we also consider execution rate. The techniques are based on profiling information and require both runtime data collection and post-processing. For such techniques to be broadly applicable, the data collection must have minimal impact on the profiled application, allow profiling of unmodified binaries, and not depend on custom hardware or operating system extensions. Furthermore, the information provided has to be accurate and easy to interpret for programmers, the runtime environment and compilers. StatStack estimates an application's miss ratio curve, StatCC estimates the miss ratios of co-running applications sharing the last-level cache, and Cache Pirating measures any desired performance metric available through hardware performance counters as a function of cache size.
We have experimentally shown that our methods are both efficient and accurate. The runtime information required by StatStack and StatCC can be collected with an average runtime overhead of 40%. The Cache Pirating method measures the desired performance metrics with an average runtime overhead of 5%.
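To make the notion of a miss ratio curve concrete, the following is a minimal sketch that computes one exactly from a full address trace, assuming a fully associative cache with LRU replacement. It is not StatStack, which estimates the curve statistically from sparse profiling data; the function names `stack_distances` and `miss_ratio_curve` and the toy trace are illustrative assumptions, not taken from the thesis.

```python
from collections import OrderedDict


def stack_distances(trace):
    """Return the LRU stack distance of every access in `trace`.

    The stack distance is the number of distinct cache lines touched
    since the previous access to the same line; first-time accesses
    (cold misses) get None.
    """
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for line in trace:
        if line in stack:
            keys = list(stack.keys())
            # Lines that sit above `line` in the LRU stack were touched more recently.
            distances.append(len(keys) - 1 - keys.index(line))
            stack.move_to_end(line)
        else:
            distances.append(None)
            stack[line] = True
    return distances


def miss_ratio_curve(trace, cache_sizes):
    """Map each cache size (in lines) to its miss ratio under exact LRU.

    Assumes a fully associative cache: an access hits if and only if
    its stack distance is smaller than the cache size.
    """
    distances = stack_distances(trace)
    curve = {}
    for size in cache_sizes:
        misses = sum(1 for d in distances if d is None or d >= size)
        curve[size] = misses / len(distances)
    return curve


if __name__ == "__main__":
    # Hypothetical trace of cache line identifiers, cycling over three lines.
    trace = ["a", "b", "c", "a", "b", "c"]
    for size, ratio in miss_ratio_curve(trace, [1, 2, 3]).items():
        print(f"cache size {size} lines: miss ratio {ratio:.2f}")
```

In this toy trace the working set is three lines, so the miss ratio only drops once the cache holds all three. This exact computation scans the LRU stack on every access, which is quadratic in the worst case and needs the complete trace, so it is useful as an illustration only.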